{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "db8b26be",
   "metadata": {},
   "source": [
    "\n",
    "# Survival Predictive Analysis and Meta-Analysis\n",
    "\n",
    "This notebook explores survival predictive analysis and meta-analysis techniques, including hazard ratios, survival metrics, forest plots, funnel plots, and meta-regression.\n",
    "\n",
    "## Dataset Information and Links\n",
    "\n",
    "1. **Meta-analysis Example Data:**\n",
    "   - Example datasets for meta-analysis are available in the `metafor` R package and can be downloaded as CSV files: [Metafor Data](https://www.metafor-project.org/doku.php/data/datasets).\n",
    "\n",
    "2. **Survival Analysis Example Data:**\n",
    "   - Kaplan-Meier and hazard ratio data can be synthesized for practice or sourced from clinical studies with survival metrics.\n",
    "\n",
    "### Prerequisites\n",
    "\n",
    "Install the required Python packages:\n",
    "```bash\n",
    "pip install PythonMeta statsmodels numpy pandas matplotlib seaborn\n",
    "```\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a2370307",
   "metadata": {},
   "source": [
    "\n",
    "## Survival Metrics\n",
    "\n",
    "Survival metrics provide insights into time-dependent clinical outcomes. Key metrics include:\n",
    "- **Overall Survival (OS):** Time from treatment initiation to mortality.\n",
    "- **Disease-Free Survival (DFS):** Time to disease recurrence or mortality.\n",
    "- **Progression-Free Survival (PFS):** Time to disease progression or mortality.\n",
    "- **Recurrence-Free Survival (RFS):** Time to recurrence or mortality.\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a896b10a",
   "metadata": {},
   "source": [
    "\n",
    "## Kaplan-Meier Curve\n",
    "\n",
    "A Kaplan-Meier curve estimates survival probabilities over time.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8b2302d3",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Example data for survival\n",
    "time_points = [1, 2, 3, 4, 5]\n",
    "survival_probs = [0.9, 0.8, 0.7, 0.5, 0.3]\n",
    "\n",
    "# Kaplan-Meier plot\n",
    "plt.step(time_points, survival_probs, where=\"post\", label=\"Survival Probability\")\n",
    "plt.xlabel(\"Time (Years)\")\n",
    "plt.ylabel(\"Survival Probability\")\n",
    "plt.title(\"Kaplan-Meier Curve\")\n",
    "plt.legend()\n",
    "plt.show()\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "62b4e78a",
   "metadata": {},
   "source": [
    "\n",
    "## Meta-Analysis: DerSimonian and Laird Method\n",
    "\n",
    "The DerSimonian and Laird inverse variance method calculates pooled hazard ratios (HRs) and confidence intervals.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6e58a8ec",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "import pandas as pd\n",
    "\n",
    "# Example data\n",
    "data = pd.DataFrame({\n",
    "    \"Study\": [\"Study1\", \"Study2\", \"Study3\"],\n",
    "    \"HR\": [0.8, 0.6, 1.2],\n",
    "    \"Variance\": [0.02, 0.03, 0.01]\n",
    "})\n",
    "\n",
    "# Calculate weights\n",
    "data[\"Weight\"] = 1 / data[\"Variance\"]\n",
    "\n",
    "# Pooled HR\n",
    "pooled_hr = np.sum(data[\"HR\"] * data[\"Weight\"]) / np.sum(data[\"Weight\"])\n",
    "\n",
    "# Variance of pooled HR\n",
    "pooled_variance = 1 / np.sum(data[\"Weight\"])\n",
    "\n",
    "print(f\"Pooled HR: {pooled_hr:.2f}\")\n",
    "print(f\"Variance of Pooled HR: {pooled_variance:.2f}\")\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3f737f94",
   "metadata": {},
   "source": [
    "\n",
    "## Forest Plot\n",
    "\n",
    "Visualize study-level and pooled hazard ratios using a forest plot.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab17c680",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# Forest plot\n",
    "plt.errorbar(data[\"HR\"], range(len(data)), xerr=np.sqrt(data[\"Variance\"]), fmt=\"o\", label=\"Individual Studies\")\n",
    "plt.axvline(pooled_hr, color=\"red\", linestyle=\"--\", label=\"Pooled HR\")\n",
    "plt.yticks(range(len(data)), data[\"Study\"])\n",
    "plt.xlabel(\"Hazard Ratio\")\n",
    "plt.title(\"Forest Plot\")\n",
    "plt.legend()\n",
    "plt.gca().invert_yaxis()  # Reverse y-axis for readability\n",
    "plt.show()\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3994324e",
   "metadata": {},
   "source": [
    "\n",
    "## Funnel Plot\n",
    "\n",
    "Assess publication bias by plotting hazard ratios against their standard errors.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "74c67c3e",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# Standard errors\n",
    "data[\"StdError\"] = np.sqrt(data[\"Variance\"])\n",
    "\n",
    "# Funnel plot\n",
    "plt.scatter(data[\"HR\"], 1 / data[\"StdError\"], alpha=0.7)\n",
    "plt.axvline(pooled_hr, color=\"red\", linestyle=\"--\", label=\"Pooled HR\")\n",
    "plt.xlabel(\"Hazard Ratio\")\n",
    "plt.ylabel(\"Precision (1 / StdError)\")\n",
    "plt.title(\"Funnel Plot\")\n",
    "plt.legend()\n",
    "plt.show()\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4b0d092c",
   "metadata": {},
   "source": [
    "\n",
    "## Meta-Regression\n",
    "\n",
    "Examine the influence of moderators (covariates) on meta-analysis results using weighted regression.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9bdaa91",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "from statsmodels.api import WLS\n",
    "\n",
    "# Example data for meta-regression\n",
    "n_studies = 5\n",
    "effect_sizes = [0.8, 0.6, 1.2, 0.7, 0.9]\n",
    "variances = [0.02, 0.03, 0.01, 0.04, 0.02]\n",
    "weights = 1 / np.array(variances)\n",
    "covariates = [10, 20, 30, 40, 50]\n",
    "\n",
    "# Meta-regression model\n",
    "X = pd.DataFrame({\"Intercept\": 1, \"Covariates\": covariates})\n",
    "y = pd.Series(effect_sizes)\n",
    "model = WLS(y, X, weights=weights).fit()\n",
    "\n",
    "# Print regression summary\n",
    "print(model.summary())\n",
    "        "
   ]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 5
}
